Accurate human microsatellite genotypes from high-throughput resequencing data using informed error profiles

نویسندگان

  • Gareth Highnam
  • Christopher Franck
  • Andy Martin
  • Calvin Stephens
  • Ashwin Puthige
  • David Mittelman
چکیده

Repetitive sequences are biologically and clinically important because they can influence traits and disease, but repeats are challenging to analyse using short-read sequencing technology. We present a tool for genotyping microsatellite repeats called RepeatSeq, which uses Bayesian model selection guided by an empirically derived error model that incorporates sequence and read properties. Next, we apply RepeatSeq to high-coverage genomes from the 1000 Genomes Project to evaluate performance and accuracy. The software uses common formats, such as VCF, for compatibility with existing genome analysis pipelines. Source code and binaries are available at http://github.com/adaptivegenome/repeatseq.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

popSTR: population-scale detection of STR variants

Motivation Microsatellites, also known as short tandem repeats (STRs), are tracts of repetitive DNA sequences containing motifs ranging from two to six bases. Microsatellites are one of the most abundant type of variation in the human genome, after single nucleotide polymorphisms (SNPs) and Indels. Microsatellite analysis has a wide range of applications, including medical genetics, forensics a...

متن کامل

megasat: automated inference of microsatellite genotypes from sequence data.

megasat is software that enables genotyping of microsatellite loci using next-generation sequencing data. Microsatellites are amplified in large multiplexes, and then sequenced in pooled amplicons. megasat reads sequence files and automatically scores microsatellite genotypes. It uses fuzzy matches to allow for sequencing errors and applies decision rules to account for amplification artefacts,...

متن کامل

Assessing genetic diversity of promising wheat (Triticum aestivum L.) lines using microsatellite markers linked with salinity tolerance

Narrow genetic variability may lead to genetic vulnerability of field crops against biotic and abiotic stresses which can cause yield reduction. In this study a set of 37 wheat microsatellite markers linked with identified QTLs for salinity tolerance were used for the assessment of genetic diversity for salinity in 30 promising lines of hexaploid bread wheat (Triticum aestivum L.). A total of 4...

متن کامل

MAGERI: Computational pipeline for molecular-barcoded targeted resequencing

Unique molecular identifiers (UMIs) show outstanding performance in targeted high-throughput resequencing, being the most promising approach for the accurate identification of rare variants in complex DNA samples. This approach has application in multiple areas, including cancer diagnostics, thus demanding dedicated software and algorithms. Here we introduce MAGERI, a computational pipeline tha...

متن کامل

Maximum-likelihood estimation of allelic dropout and false allele error rates from microsatellite genotypes in the absence of reference data.

The importance of quantifying and accounting for stochastic genotyping errors when analyzing microsatellite data is increasingly being recognized. This awareness is motivating the development of data analysis methods that not only take errors into consideration but also recognize the difference between two distinct classes of error, allelic dropout and false alleles. Currently methods to estima...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره 41  شماره 

صفحات  -

تاریخ انتشار 2013